Algoritmo de odometría visual estéreo para sistemas de ayuda a la conducción: implementación en GPU mediante CUDA
The objective of this project is the development and implementation of a localization system for cars based on visual odometry, which consists in tracking feature points on the road surface detected by a stereo vision system. To meet the computation-time requirements, parallel computing techniques are used, such as CUDA (Compute Unified Device Architecture), which allows running processes in parallel on NVIDIA GPUs (Graphics Processing Units). The system is designed to be part of the advanced driver assistance systems (ADAS) equipped on the IVVI 2.0 (Intelligent Vehicle based on Visual Information) of Universidad Carlos III de Madrid.
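Once the tracked road features are triangulated by the stereo pair, the frame-to-frame motion estimate reduces to aligning two 3D point sets. A minimal numpy sketch of that alignment step using the SVD-based Kabsch solution (illustrative only; this is not the project's CUDA implementation, and the function name is an assumption):

```python
import numpy as np

def rigid_motion(P, Q):
    """Estimate the rotation R and translation t mapping point set P
    onto Q (both Nx3) via the SVD-based Kabsch algorithm."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```

Chaining the per-frame (R, t) estimates yields the vehicle trajectory; the CUDA part of the project concerns parallelizing the feature detection and matching that precede this step.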
Traffic scene awareness for intelligent vehicles using ConvNets and stereo vision
In this paper, we propose an efficient approach to perform recognition and 3D localization of dynamic objects in images from a stereo camera, with the goal of gaining insight into traffic scenes in urban and road environments. We rely on a deep learning framework able to simultaneously identify a broad range of entities, such as vehicles, pedestrians, or cyclists, at a frame rate compatible with the strict requirements of onboard automotive applications. Stereo information is later introduced to enrich the knowledge about the objects with geometrical information. The results demonstrate the capabilities of the perception system for a wide variety of situations, thus providing valuable information for a higher-level understanding of the traffic situation.
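The geometric enrichment step can be sketched as follows: given a 2D detection and a disparity map from a calibrated stereo pair, depth follows from Z = f·B/d, and the pinhole model recovers the lateral position. A hedged numpy illustration (parameter names and the median-disparity heuristic are assumptions, not the paper's code):

```python
import numpy as np

def box_to_3d(box, disparity, f, B, cx, cy):
    """Locate a 2D detection in 3D camera coordinates.
    box: (u1, v1, u2, v2) pixel corners; disparity: HxW map in pixels;
    f: focal length (px); B: stereo baseline (m); (cx, cy): principal point."""
    u1, v1, u2, v2 = box
    patch = disparity[v1:v2, u1:u2]
    d = np.median(patch[patch > 0])   # robust disparity for the object
    Z = f * B / d                     # depth from stereo geometry
    u, v = (u1 + u2) / 2.0, (v1 + v2) / 2.0
    X = (u - cx) * Z / f              # back-project the box center
    Y = (v - cy) * Z / f
    return np.array([X, Y, Z])
```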
Fast Joint Object Detection and Viewpoint Estimation for Traffic Scene Understanding
Environment perception is a critical enabler for automated driving systems since it allows a comprehensive understanding of traffic situations, which is a requirement to ensure safe and reliable operation. Among the different applications, obstacle identification is a primary module of the perception system. We propose a vision-based method built upon a deep convolutional neural network that can reason simultaneously about the location of objects in the image and their orientations on the ground plane. The same set of convolutional layers is used for the different tasks involved, avoiding the repetition of computations over the same image. Experiments on the KITTI dataset show that our efficiency-oriented method achieves state-of-the-art accuracies for object detection and viewpoint estimation, and is particularly suitable for the recognition of traffic situations from on-board vision systems. Code is available at https://github.com/cguindel/lsi-faster-rcnn.
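Viewpoint branches in this family of detectors are commonly formulated as classification over discretized angle bins. A hedged numpy sketch of decoding a continuous angle from such bin scores using a probability-weighted circular mean (the bin layout and decoding rule are illustrative assumptions, not necessarily the paper's exact formulation):

```python
import numpy as np

def decode_viewpoint(logits, n_bins=8):
    """Decode a continuous viewpoint angle in [-pi, pi) from per-bin
    classification logits over n_bins discretized orientations."""
    p = np.exp(logits - logits.max())
    p /= p.sum()                          # softmax over angle bins
    centers = -np.pi + (np.arange(n_bins) + 0.5) * (2 * np.pi / n_bins)
    # probability-weighted circular mean: robust at the +/-pi wrap-around
    return np.arctan2((p * np.sin(centers)).sum(),
                      (p * np.cos(centers)).sum())
```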
Analysis of the Influence of Training Data on Road User Detection
In this paper, we discuss the relevance of training data for modern object detectors used in onboard applications. Whereas modern deep learning techniques require large amounts of data, datasets with typical scenarios for autonomous vehicles are scarce and have a reduced number of samples. We conduct a comprehensive set of experiments to understand the effect of using a combination of two relatively small datasets to train an end-to-end object detector, based on the popular Faster R-CNN and enhanced with orientation estimation capabilities. We also test the adequacy of training models using partially available ground-truth labels, as a consequence of combining datasets aimed at different applications. Data augmentation is also introduced into the training pipeline. Results show a significant performance improvement in our exemplary case as a result of the higher variability of the training samples, thus opening a new way to improve detection performance independently of the detector architecture. This work was supported by the Spanish Government through the CICYT projects TRA2015-63708-R and TRA2016-78886-C3-1-R, and the Comunidad de Madrid through SEGVAUTO-TRIES (S2013/MIT-2713).
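Training with partially available ground truth, as described above, is typically handled by masking the loss terms for which a sample carries no annotation, so unlabeled samples contribute no gradient to that branch. A minimal sketch under that assumption (function and argument names are illustrative, not the paper's training code):

```python
import numpy as np

def masked_multitask_loss(cls_loss, orient_loss, has_orientation):
    """Combine per-sample detection and orientation losses when only
    part of the batch carries orientation labels: unlabeled samples
    are excluded from the orientation term."""
    mask = np.asarray(has_orientation, dtype=float)
    # average the orientation loss over labeled samples only
    orient = (mask * orient_loss).sum() / max(mask.sum(), 1.0)
    return cls_loss.mean() + orient
```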
eHMI: review and guidelines for deployment on autonomous vehicles
Human-machine interaction is an active area of research due to the rapid development of autonomous systems and the need for communication. This review provides further insight into the specific issue of the information flow between pedestrians and automated vehicles by evaluating recent advances in external human-machine interfaces (eHMI), which enable the transmission of state and intent information from the vehicle to the rest of the traffic participants. Recent developments are explored, and studies analyzing their effectiveness based on pedestrian feedback data are presented and contextualized. As a result, we aim to provide a broad perspective on the current status of and recent techniques for eHMI, together with some guidelines that will encourage future research and development of these systems.
Study of the Effect of Exploiting 3D Semantic Segmentation in LiDAR Odometry
This article belongs to the Special Issue Intelligent Transportation Systems.
This paper presents a study of how the performance of LiDAR odometry is affected by the preprocessing of the point cloud through the use of 3D semantic segmentation. The study analyzed the estimated trajectories when the semantic information is exploited to filter the original raw data. Different filtering configurations were tested: raw (original point cloud), dynamic (dynamic obstacles are removed from the point cloud), dynamic vehicles (vehicles are removed), far (distant points are removed), ground (points belonging to the ground are removed), and structure (only structures and objects are kept in the point cloud). The experiments were performed using the KITTI and SemanticKITTI datasets, which feature different scenarios that allow identifying the implications and relevance of each element of the environment in LiDAR odometry algorithms. The conclusions obtained from this work are of special relevance for improving the efficiency of LiDAR odometry algorithms in all kinds of scenarios. Research was supported by the Spanish Government through the CICYT projects (TRA2016-78886-C3-1-R and RTI2018-096036-B-C21) and the Comunidad de Madrid through SEGVAUTO-4.0-CM (P2018/EMT-4362) and PEAVAUTO-CM-UC3M.
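The filtering configurations listed above amount to boolean masks over the semantic labels of the cloud. A small numpy sketch, with label ids loosely following SemanticKITTI conventions (the specific ids and the 50 m range cut are assumptions for illustration):

```python
import numpy as np

# Hypothetical label ids, loosely following SemanticKITTI conventions.
DYNAMIC = {10, 11, 30}   # e.g. car, bicycle, person
GROUND = {40, 48}        # e.g. road, sidewalk

def filter_cloud(points, labels, mode="dynamic"):
    """Return the subset of an Nx3 cloud kept by one of the filtering
    configurations studied (raw / dynamic / ground / far)."""
    if mode == "raw":
        keep = np.ones(len(points), dtype=bool)
    elif mode == "dynamic":
        keep = ~np.isin(labels, list(DYNAMIC))   # drop moving agents
    elif mode == "ground":
        keep = ~np.isin(labels, list(GROUND))    # drop ground points
    elif mode == "far":
        keep = np.linalg.norm(points, axis=1) <= 50.0  # assumed cut
    else:
        raise ValueError(mode)
    return points[keep]
```

The filtered cloud is then fed unchanged to the odometry algorithm, which is what lets the study isolate the contribution of each element of the environment.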
High-accuracy patternless calibration of multiple 3D LiDARs for autonomous vehicles
This article proposes a new method for estimating the extrinsic calibration parameters between any pair of multibeam LiDAR sensors on a vehicle. Unlike many state-of-the-art works, this method does not use any calibration pattern or reflective marks placed in the environment to perform the calibration; in addition, the sensors do not need to have overlapping fields of view. An iterative closest point (ICP)-based process is used to determine the values of the calibration parameters, resulting in better convergence and improved accuracy. Furthermore, a setup based on the CARLA (Car Learning to Act) simulator is introduced to evaluate the approach, enabling quantitative assessment with ground-truth data. The results show an accuracy comparable with other approaches that require more complex procedures and have a more restricted range of applicable setups. This work also provides qualitative results on a real setup, where the alignment between the different point clouds can be visually checked. The open-source code is available at https://github.com/midemig/pcd_calib. This work was supported in part by the Madrid Government (Comunidad de Madrid-Spain) under the Multiannual Agreement with UC3M ("Fostering Young Doctors Research," APBI-CM-UC3M) in the context of the V PRICIT (Research and Technological Innovation Regional Program); and in part by the Spanish Government through Grants ID2021-128327OA-I00 and TED2021-129374A-I00 funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.
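The core of an ICP-based process like the one described is the alternation between nearest-neighbour matching and a closed-form SVD alignment. A minimal single-iteration sketch (brute-force matching for clarity; the paper's pipeline adds initialization, outlier handling, and convergence checks not shown here):

```python
import numpy as np

def icp_step(src, dst):
    """One iteration of point-to-point ICP: match each source point to
    its nearest destination point, then align with the SVD solution."""
    # brute-force nearest neighbours (fine for a small sketch)
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]
    cs, cm = src.mean(0), matched.mean(0)
    H = (src - cs).T @ (matched - cm)        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cm - R @ cs
    return src @ R.T + t, R, t
```

Iterating this step until the correspondences stop changing yields the relative pose between the two sensors' clouds, i.e. the extrinsic calibration.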
BirdNet+: two-stage 3D object detection in LiDAR through a sparsity-invariant bird's eye view
Autonomous navigation relies upon an accurate understanding of the elements in the surroundings. Among the different on-board perception tasks, 3D object detection allows the identification of dynamic objects that cannot be registered by maps, being key for safe navigation. Thus, it often requires the use of LiDAR data, which is able to faithfully represent the scene geometry. However, although raw laser point clouds contain rich features to perform object detection, more compact representations such as the bird's eye view (BEV) projection are usually preferred in order to meet the time requirements of the control loop. This paper presents an end-to-end object detection network based on the well-known Faster R-CNN architecture that uses BEV images as input to produce the final 3D boxes. Our regression branches can infer not only the axis-aligned bounding boxes but also the rotation angle, height, and elevation of the objects in the scene. The proposed network provides state-of-the-art results for car, pedestrian, and cyclist detection with a single forward pass when evaluated on the KITTI 3D Object Detection Benchmark, with an accuracy that exceeds 64% mAP 3D for the Moderate difficulty. Further experiments on the challenging nuScenes dataset show the generalizability of both the method and the proposed BEV representation against different LiDAR devices and across a wider set of object categories, reaching more than 30% mAP with a single LiDAR sweep and almost 40% mAP with the usual 10-sweep accumulation. This work was supported in part by the Government of Madrid (Comunidad de Madrid) under the Multiannual Agreement with the University Carlos III of Madrid (UC3M) in the line of "Fostering Young Doctors Research" (PEAVAUTO-CM-UC3M), and in part in the context of the V Regional Programme of Research and Technological Innovation (PRICIT).
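The BEV input can be sketched as a simple ground-plane rasterization of the cloud. The sketch below builds a single max-height channel, whereas BirdNet+ uses a richer multi-channel encoding; the ranges and cell resolution are assumed values, not the paper's configuration:

```python
import numpy as np

def bev_image(points, x_range=(0, 40), y_range=(-20, 20), res=0.1):
    """Project an Nx3 LiDAR cloud into a bird's eye view height map:
    a 2D grid over the ground plane storing the maximum z per cell."""
    x, y, z = points.T
    m = ((x >= x_range[0]) & (x < x_range[1]) &
         (y >= y_range[0]) & (y < y_range[1]))
    x, y, z = x[m], y[m], z[m]
    W = int((x_range[1] - x_range[0]) / res)
    H = int((y_range[1] - y_range[0]) / res)
    img = np.full((H, W), -np.inf)
    cols = ((x - x_range[0]) / res).astype(int)
    rows = ((y - y_range[0]) / res).astype(int)
    np.maximum.at(img, (rows, cols), z)   # keep max height per cell
    img[np.isinf(img)] = 0.0              # empty cells
    return img
```

The resulting image can then be consumed by a standard 2D detection backbone, which is what makes the single-forward-pass design possible.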
A method for synthetic LiDAR generation to create annotated datasets for autonomous vehicles perception
Proceedings of: 2019 IEEE Intelligent Transportation Systems Conference (ITSC).
LiDAR devices have become a key sensor for autonomous vehicle perception due to their ability to capture reliable geometry information. Indeed, approaches processing LiDAR data have shown impressive accuracy in 3D object detection tasks, outperforming methods based solely on image inputs. However, the wide diversity of onboard sensor configurations makes the deployment of published algorithms on real platforms a hard task, due to the scarcity of annotated datasets containing laser scans. We present a method to generate new point cloud datasets as if captured by a real LiDAR device. The proposed pipeline makes use of multiple frames to perform an accurate 3D reconstruction of the scene in spherical coordinates, which enables the simulation of the sweeps of a virtual LiDAR sensor, configurable both in location and inner specifications. The similarity between real data and the generated synthetic clouds is assessed through a set of experiments performed using the KITTI Depth and Object Benchmarks. Research was supported by the Spanish Government through the CICYT projects (TRA2016-78886-C3-1-R and RTI2018-096036-B-C21), and the Comunidad de Madrid through SEGVAUTO-4.0-CM (P2018/EMT-4362). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPUs used for this research.
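The virtual-sensor step can be illustrated by binning a dense reconstruction in spherical coordinates and keeping the closest return per (azimuth, elevation) beam, mimicking how a spinning sensor samples the scene. A hedged sketch; the beam counts and elevation limits (roughly -25° to +5°, a typical automotive field of view) are assumptions, not the paper's configuration:

```python
import numpy as np

def virtual_scan(points, n_az=1024, n_el=64, el_range=(-0.4363, 0.0873)):
    """Resample an Nx3 reconstruction as a virtual LiDAR range image:
    bin points by (azimuth, elevation) and keep the closest range per
    beam, as a spinning sensor would."""
    x, y, z = points.T
    r = np.linalg.norm(points, axis=1)
    az = np.arctan2(y, x)                                  # [-pi, pi)
    el = np.arcsin(np.clip(z / np.maximum(r, 1e-9), -1, 1))
    ai = ((az + np.pi) / (2 * np.pi) * n_az).astype(int) % n_az
    ei = ((el - el_range[0]) / (el_range[1] - el_range[0]) * n_el).astype(int)
    valid = (ei >= 0) & (ei < n_el)
    grid = np.full((n_el, n_az), np.inf)
    np.minimum.at(grid, (ei[valid], ai[valid]), r[valid])  # nearest return
    return grid
```

Converting each finite grid cell back to Cartesian coordinates yields the synthetic sweep; occluded surfaces are naturally suppressed because only the closest return per beam survives.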
Convolutional neural networks for joint object detection and pose estimation in traffic scenes
Mención Internacional en el título de doctor (International Mention in the doctoral degree).

Few now question that autonomous vehicles will be a key element of transportation in the coming decades. Reliable perception of the surroundings of the vehicle is today one of the remaining technical challenges that must be addressed to ensure safe autonomous navigation, especially in crowded environments. This functionality usually relies on onboard sensors, which provide data that must be appropriately processed.

Among the different tasks assigned to the perception suite of an automated vehicle, the detection of other road users that can potentially interfere with the trajectory of the vehicle is particularly critical. However, the identification of agents in sensor data is only the first step. Planning and control modules down the pipeline demand trustworthy information about how the objects are arranged in space. In particular, their orientation and location on the road plane are usually attributes of utmost importance to build a purposeful model of the environment.

This thesis aims to provide close-to-market solutions to these issues, taking advantage of the dramatic breakthrough seen in deep neural networks in the past decade. The methods proposed in this thesis are built on top of a popular detection framework, Faster R-CNN, which features high detection accuracy at near real-time rates. Some proposals to enhance the performance of the algorithm on images obtained from onboard cameras are introduced and discussed.

One of the central contributions of the thesis is the extension of the Faster R-CNN framework to estimate the orientation of the detected objects based exclusively on appearance information, which makes the method robust against the different sources of error present in traffic environments. As a natural next step, two algorithms exploiting this functionality to perform 3D object localization are proposed. As a result, the combination of the methods described throughout this thesis leads to a procedure able to provide situational awareness of the potential hazards in the surroundings of the vehicle.

All the proposed methods are analyzed and validated through systematic experimentation using a well-recognized public dataset (the KITTI Vision Benchmark Suite), where notable results were obtained. The viability of implementing the solutions in a real vehicle is also discussed.
Doctoral Program in Electrical, Electronic and Automation Engineering, Universidad Carlos III de Madrid. Committee: President, Felipe Jiménez Alonso; Secretary, Basam Musleh Lancis; Member, Eduardo José Molinos Vicent.